
    IMPUTING OR SMOOTHING? MODELLING THE MISSING ONLINE CUSTOMER JOURNEY TRANSITIONS FOR PURCHASE PREDICTION

    Online customer journeys are at the core of e-commerce systems, and it is therefore important to model and understand this online customer behaviour. Clickstream data from online journeys can be modelled using Markov chains. This study investigates two different approaches to handling missing transition probabilities when constructing Markov chain models for purchase prediction. Imputing the transition probabilities using the Chapman-Kolmogorov (CK) equation addresses this issue and achieves high prediction accuracy by approximating each missing probability with a one-step-ahead probability. However, it comes with a high computational burden, and some probabilities remain zero after imputation. An alternative approach is to smooth the transition probabilities using Bayesian techniques. This guarantees non-zero probabilities, but the approach has been criticized as less accurate than the CK method, although this claim has not been fully evaluated in the literature on realistic, commercial data. We compare the purchase-prediction accuracy of the CK and Bayesian methods, evaluating both on commercial web server data from a major European airline.
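The contrast between the two approaches can be sketched on a toy clickstream. The function names below are illustrative, not from the paper: `ck_impute` fills a missing one-step probability with the corresponding two-step (Chapman-Kolmogorov) probability, while `bayes_smooth` applies Dirichlet (add-alpha) smoothing so that every transition probability is strictly positive.

```python
import numpy as np

def transition_matrix(journeys, n_states):
    """Maximum-likelihood transition matrix from clickstream journeys.
    Transitions never observed leave zero entries: the missing-transition
    problem."""
    counts = np.zeros((n_states, n_states))
    for path in journeys:
        for a, b in zip(path, path[1:]):
            counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)
    return P, counts

def ck_impute(P):
    """Chapman-Kolmogorov-style imputation: fill a missing one-step
    probability with the two-step probability (P @ P), then renormalise.
    Entries with no two-step path remain zero."""
    Q = np.where(P == 0, P @ P, P)
    rows = Q.sum(axis=1, keepdims=True)
    return np.divide(Q, rows, out=np.zeros_like(Q), where=rows > 0)

def bayes_smooth(counts, alpha=1.0):
    """Bayesian (Dirichlet) smoothing: pseudo-counts guarantee strictly
    positive transition probabilities."""
    smoothed = counts + alpha
    return smoothed / smoothed.sum(axis=1, keepdims=True)
```

Note that, as the abstract observes, CK imputation can leave entries at zero (when no two-step path exists either), whereas the Bayesian smoother cannot.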

    Representation Disparity-aware Distillation for 3D Object Detection

    In this paper, we focus on developing knowledge distillation (KD) for compact 3D detectors. We observe that off-the-shelf KD methods manifest their efficacy only when the teacher model and its student counterpart share similar intermediate feature representations. This might explain why they are less effective in building extremely compact 3D detectors, where significant representation disparity arises, due primarily to the intrinsic sparsity and irregularity of 3D point clouds. This paper presents a novel representation disparity-aware distillation (RDD) method to address the representation disparity issue and reduce the performance gap between compact students and over-parameterized teachers. This is accomplished by building our RDD from an innovative information bottleneck (IB) perspective, which can effectively minimize the disparity of proposal region pairs from student and teacher in both features and logits. Extensive experiments demonstrate the superiority of our RDD over existing KD methods. For example, our RDD increases the mAP of CP-Voxel-S to 57.1% on the nuScenes dataset, which even surpasses the teacher's performance while taking up only 42% of the FLOPs. Comment: Accepted by ICCV 2023. arXiv admin note: text overlap with arXiv:2205.15156 by other authors.
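For readers new to distillation, the generic soft-target KD objective that methods like RDD build on can be written in a few lines. This is the classic temperature-softened teacher-student KL loss, not the paper's IB-based RDD; all names are illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target knowledge-distillation loss: KL(teacher || student)
    on temperature-softened class distributions, scaled by T**2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T**2 * np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))
```

The loss is zero when student and teacher logits agree and strictly positive otherwise, pulling the student's softened class distribution toward the teacher's.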

    Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent

    Gaussian processes are a powerful framework for quantifying uncertainty and for sequential decision-making but are limited by the requirement of solving linear systems. In general, this has a cubic cost in dataset size and is sensitive to conditioning. We explore stochastic gradient algorithms as a computationally efficient method of approximately solving these linear systems: we develop low-variance optimization objectives for sampling from the posterior and extend these to inducing points. Counterintuitively, stochastic gradient descent often produces accurate predictions, even in cases where it does not converge quickly to the optimum. We explain this through a spectral characterization of the implicit bias from non-convergence. We show that stochastic gradient descent produces predictive distributions close to the true posterior both in regions with sufficient data coverage, and in regions sufficiently far away from the data. Experimentally, stochastic gradient descent achieves state-of-the-art performance on sufficiently large-scale or ill-conditioned regression tasks. Its uncertainty estimates match the performance of significantly more expensive baselines on a large-scale Bayesian optimization task.
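The linear solve at the heart of this idea can be illustrated on a toy problem. The posterior mean requires representer weights v with (K + σ²I)v = y; the sketch below (my own toy setup and hyperparameters, not the paper's algorithm, which additionally covers posterior sampling and inducing points) approximately solves this system by SGD over randomly sampled rows instead of a cubic-cost factorisation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.5):
    """Squared-exponential kernel."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

# Toy regression data
X = rng.uniform(-2, 2, size=(200, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(200)
noise = 0.1

A = rbf(X, X) + noise * np.eye(len(X))

# SGD on f(v) = 0.5 v^T A v - v^T y, whose minimiser solves A v = y.
v = np.zeros(len(X))
lr, batch = 1e-3, 32
for _ in range(5000):
    idx = rng.choice(len(X), size=batch, replace=False)
    g = np.zeros_like(v)
    # unbiased row-sampled gradient estimate of A v - y
    g[idx] = (A[idx] @ v - y[idx]) * (len(X) / batch)
    v -= lr * g

# Posterior mean at new inputs: m(x*) = k(x*, X) @ v
Xs = np.linspace(-2, 2, 5)[:, None]
mean = rbf(Xs, X) @ v
```

Even without full convergence, the residual shrinks fastest along the kernel's dominant eigendirections, which is the mechanism behind the benign non-convergence the abstract describes.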

    Beyond Intuition, a Framework for Applying GPs to Real-World Data

    Gaussian Processes (GPs) offer an attractive method for regression over small, structured and correlated datasets. However, their deployment is hindered by computational costs and limited guidelines on how to apply GPs beyond simple low-dimensional datasets. We propose a framework to identify the suitability of GPs to a given problem and how to set up a robust and well-specified GP model. The guidelines formalise the decisions of experienced GP practitioners, with an emphasis on kernel design and options for computational scalability. The framework is then applied to a case study of glacier elevation change, yielding more accurate results at test time. Comment: Accepted at the 1st ICML Workshop on Structured Probabilistic Inference and Generative Modelling (2023).

    Stochastic Gradient Descent for Gaussian Processes Done Right

    We study the optimisation problem associated with Gaussian process regression using squared loss. The most common approach to this problem is to apply an exact solver, such as conjugate gradients, either directly or to a reduced-order version of the problem. Recently, driven by successes in deep learning, stochastic gradient descent has gained traction as an alternative. In this paper, we show that when done right, by which we mean using specific insights from the optimisation and kernel communities, this approach is highly effective. We thus introduce a particular stochastic dual gradient descent algorithm that may be implemented with a few lines of code using any deep learning framework. We explain our design decisions by illustrating their advantage against alternatives in ablation studies, and show that the new method is highly competitive. Our evaluations on standard regression benchmarks and a Bayesian optimisation task set our approach apart from preconditioned conjugate gradients, variational Gaussian process approximations, and a previous version of stochastic gradient descent for Gaussian processes. On a molecular binding affinity prediction task, our method places Gaussian process regression on par, in terms of performance, with state-of-the-art graph neural networks.
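A minimal sketch of the flavour of such an algorithm, assuming my own toy data and hyperparameters rather than the paper's exact recipe: SGD on the dual quadratic objective, combined with momentum and Polyak iterate averaging, two of the ingredients that "done right" versions typically add.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(A, B, ls=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

X = rng.uniform(-2, 2, size=(150, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(150)
noise = 0.1
n = len(X)
K = rbf(X, X)

# Dual objective g(a) = 0.5 a^T (K + noise*I) a - a^T y, optimised with
# momentum SGD over random coordinate blocks plus Polyak iterate averaging.
a = np.zeros(n)
vel = np.zeros(n)
avg = np.zeros(n)
lr, mom, batch, T = 1e-3, 0.9, 16, 3000
for _ in range(T):
    idx = rng.choice(n, size=batch, replace=False)
    g = np.zeros(n)
    g[idx] = (K[idx] @ a + noise * a[idx] - y[idx]) * (n / batch)
    vel = mom * vel + g
    a -= lr * vel
    avg += a / T

# Reference: exact representer weights from a dense solve
a_exact = np.linalg.solve(K + noise * np.eye(n), y)
```

Averaging the iterates damps gradient noise, so the averaged weights give predictions close to the exact solve well before the raw iterates converge.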

    Deep-Learning-Enabled Fast Optical Identification and Characterization of Two-Dimensional Materials

    Advanced microscopy and spectroscopy tools play an indispensable role in nanoscience and nanotechnology research, as they provide rich information about growth mechanisms, chemical composition, crystallography, and other important physical and chemical properties. However, the interpretation of imaging data relies heavily on the "intuition" of experienced researchers. As a result, many of the deep graphical features obtained through these tools often go unused because of difficulties in processing the data and finding the correlations. Such challenges can be well addressed by deep learning. In this work, we use the optical characterization of two-dimensional (2D) materials as a case study and demonstrate a neural-network-based algorithm for the material and thickness identification of exfoliated 2D materials with high prediction accuracy and real-time processing capability. Further analysis shows that the trained network can extract deep graphical features such as contrast, color, edges, shapes, segment sizes, and their distributions, based on which we develop an ensemble approach to predict the most relevant physical properties of 2D materials. Finally, a transfer learning technique is applied to adapt the pretrained network to other applications, such as identifying the layer numbers of a new 2D material or of materials produced by a different synthetic approach. Our artificial-intelligence-based material characterization approach is a powerful tool that would speed up the preparation and initial characterization of 2D materials and other nanomaterials, and potentially accelerate new material discoveries.
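The identify-then-transfer workflow can be caricatured with a tiny network on synthetic "optical contrast" features. Everything here (feature vectors, class means, network size) is invented for illustration; the point is only the mechanic of freezing a learned feature layer and retraining the classifier head for a new material.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_flakes(means, n=100, noise=0.05):
    """Synthetic optical-contrast features: one Gaussian blob per thickness class."""
    X = np.vstack([np.asarray(m) + noise * rng.standard_normal((n, len(m)))
                   for m in means])
    y = np.repeat(np.arange(len(means)), n)
    return X, y

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class TinyNet:
    """Two-layer MLP: a feature layer (W1) and a classifier head (W2)."""
    def __init__(self, d_in, d_hid, d_out):
        self.W1 = 0.5 * rng.standard_normal((d_in, d_hid))
        self.W2 = 0.5 * rng.standard_normal((d_hid, d_out))

    def features(self, X):
        return np.maximum(X @ self.W1, 0.0)  # ReLU features

    def predict(self, X):
        return softmax(self.features(X) @ self.W2)

    def fit(self, X, y, epochs=500, lr=0.5, freeze_features=False):
        Y = np.eye(self.W2.shape[1])[y]
        for _ in range(epochs):
            H = self.features(X)
            G = (softmax(H @ self.W2) - Y) / len(X)  # softmax cross-entropy grad
            if not freeze_features:
                dH = (G @ self.W2.T) * (H > 0)
                self.W1 -= lr * X.T @ dH
            self.W2 -= lr * H.T @ G

def accuracy(net, X, y):
    return (net.predict(X).argmax(axis=1) == y).mean()

# Pretrain on "material A" thickness classes
Xa, ya = make_flakes([[0.1, 0.2, 0.3], [0.4, 0.1, 0.5], [0.8, 0.6, 0.2]])
net = TinyNet(3, 16, 3)
net.fit(Xa, ya)
# Transfer learning: keep the learned feature layer, retrain only the head
Xb, yb = make_flakes([[0.2, 0.7, 0.1], [0.6, 0.3, 0.8], [0.9, 0.1, 0.4]])
net.W2 = 0.5 * rng.standard_normal((16, 3))
net.fit(Xb, yb, freeze_features=True)
```

The frozen feature layer plays the role of the pretrained network body; only the lightweight head is refit for the new material, which is what makes transfer cheap.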

    Boosting the Electrocatalytic Activity of Nickel-Iron Layered Double Hydroxide for the Oxygen Evolution Reaction by Terephthalic Acid

    The development of new oxygen evolution reaction (OER) catalysts to reduce the energy loss in water electrolysis is of great significance for realizing industrial-scale hydrogen energy storage. Herein, we report nickel-iron layered double hydroxide (NiFe-LDH) catalysts mixed with different equivalents of terephthalic acid (TPA), synthesized by the hydrothermal method. The catalyst synthesized from a precursor solution containing one equivalent of TPA shows the best performance, with a current density of 2 mA cm−2 at an overpotential of 270 mV, a Tafel slope of 40 mV dec−1, and stable electrocatalytic performance for the OER. These catalysts were characterized by a variety of methods. X-ray diffraction (XRD), Fourier transform infrared (FTIR) spectroscopy, and Raman spectroscopy confirmed the presence of TPA in the catalysts. The lamellar structure and the uniform distribution of Ni and Fe in the catalysts were observed by scanning electron microscopy (SEM) and transmission electron microscopy (TEM). In X-ray photoelectron spectroscopy (XPS) of NiFe-LDH with and without TPA, the changes in the peak positions of the Ni and Fe spectra indicate strong electronic interactions between TPA and the Ni and Fe atoms. These results suggest that a certain amount of TPA can boost catalytic activity.
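The reported figures permit a back-of-envelope Tafel extrapolation. Assuming purely Tafel-controlled kinetics over the relevant range (an illustration, not the authors' analysis), current density scales as one decade per 40 mV of extra overpotential:

```python
def current_density(eta_mV, eta_ref_mV=270.0, j_ref_mA_cm2=2.0, b_mV_dec=40.0):
    """Tafel relation: j = j_ref * 10**((eta - eta_ref) / b), anchored at the
    reported 2 mA cm^-2 at 270 mV with a 40 mV/dec slope."""
    return j_ref_mA_cm2 * 10 ** ((eta_mV - eta_ref_mV) / b_mV_dec)
```

At 310 mV, one Tafel slope above the reference, this predicts a tenfold increase to 20 mA cm−2, provided no mass-transport or stability limits intervene.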

    Latent Derivative Bayesian Last Layer Networks

    Bayesian neural networks (BNNs) are powerful parametric models for nonlinear regression with uncertainty quantification. However, the approximate inference techniques for weight space priors suffer from several drawbacks. The 'Bayesian last layer' (BLL) is an alternative BNN approach that learns the feature space for an exact Bayesian linear model with explicit predictive distributions. However, its predictions outside of the data distribution (OOD) are typically overconfident, as the marginal likelihood objective results in a learned feature space that overfits to the data. We overcome this weakness by introducing a functional prior on the model's derivatives w.r.t. the inputs. Treating these Jacobians as latent variables, we incorporate the prior into the objective to influence the smoothness and diversity of the features, which enables greater predictive uncertainty. For the BLL, the Jacobians can be computed directly using forward-mode automatic differentiation, and the distribution over Jacobians may be obtained in closed form. We demonstrate that this method enhances the BLL to Gaussian-process-like performance on tasks where calibrated uncertainty is critical: OOD regression, Bayesian optimization, and active learning, including high-dimensional real-world datasets. Peer reviewed.
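The exact Bayesian linear last layer that the BLL builds on can be written in closed form. The sketch below uses a fixed, hypothetical RBF feature map in place of a trained network body and shows only the standard BLL posterior and predictive variance; it does not implement the paper's latent-derivative prior.

```python
import numpy as np

rng = np.random.default_rng(2)

def phi(x, centers, ls=0.3):
    """A fixed, hypothetical feature map standing in for a trained network body."""
    return np.exp(-0.5 * (x[:, None] - centers[None, :]) ** 2 / ls**2)

centers = np.linspace(-1, 1, 10)
X = rng.uniform(-1, 1, 100)
y = np.sin(3 * X) + 0.1 * rng.standard_normal(100)

noise, prior = 0.01, 1.0   # observation noise sigma^2 and prior weight variance
F = phi(X, centers)

# Exact Bayesian linear-model posterior over last-layer weights
S_inv = F.T @ F / noise + np.eye(len(centers)) / prior
S = np.linalg.inv(S_inv)
mu = S @ F.T @ y / noise

def predict(x):
    """Closed-form predictive mean and variance of the Bayesian last layer."""
    f = phi(np.atleast_1d(x), centers)
    mean = f @ mu
    var = noise + np.einsum("ni,ij,nj->n", f, S, f)
    return mean, var
```

With a feature map like this one, the features vanish far from the centers and the predictive variance collapses toward the noise floor, which illustrates the OOD overconfidence the paper's functional prior on Jacobians is designed to counteract.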

    The production of iron oxide during peridotite serpentinization: Influence of pyroxene

    Serpentinization produces molecular hydrogen (H2) that can support communities of microorganisms in hydrothermal fields; H2 results from the oxidation of ferrous iron in olivine and pyroxene into ferric iron, and consequently iron oxide (magnetite or hematite) forms. However, the mechanisms that control H2 and iron oxide formation are poorly constrained. In this study, we performed serpentinization experiments at 311 °C and 3.0 kbar on olivine (with <5% pyroxene), orthopyroxene, and peridotite. The results show that serpentine and iron oxide formed when olivine and orthopyroxene individually reacted with a saline starting solution. Olivine-derived serpentine had a significantly lower FeO content (6.57 ± 1.30 wt.%) than primary olivine (9.86 wt.%), whereas orthopyroxene-derived serpentine had a comparable FeO content (6.26 ± 0.58 wt.%) to that of primary orthopyroxene (6.24 wt.%). In experiments on peridotite, olivine was replaced by serpentine and iron oxide. However, pyroxene transformed solely to serpentine. After 20 days, olivine-derived serpentine had a FeO content of 8.18 ± 1.56 wt.%, which was significantly higher than that of serpentine produced in olivine-only experiments. By contrast, serpentine after orthopyroxene had a slightly higher FeO content (6.53 ± 1.01 wt.%) than primary orthopyroxene. Clinopyroxene-derived serpentine contained a significantly higher FeO content than its parent mineral. After 120 days, the FeO content of olivine-derived serpentine decreased significantly (5.71 ± 0.35 wt.%), whereas the FeO content of orthopyroxene-derived serpentine increased (6.85 ± 0.63 wt.%) over the same period. This suggests that iron oxide preferentially formed after olivine serpentinization. Pyroxene in peridotite gained some Fe from olivine during the serpentinization process, which may have led to a decrease in iron oxide production. 
The correlation between FeO content and SiO2 or Al2O3 content in olivine- and orthopyroxene-derived serpentine indicates that aluminum and silica strongly control the production of iron oxide. Based on our results and data from natural serpentinites reported by other workers, we propose that aluminum may be more influential during the early stages of peridotite serpentinization, when the production of iron oxide is very low, whereas silica may exert greater control on iron oxide production during the late stages.
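As a rough sense of scale, the FeO deficits reported above can be converted to an H2 yield under the idealised magnetite-forming reaction 3 FeO + H2O → Fe3O4 + H2. This assumes all "missing" FeO is oxidised to magnetite, which the study does not claim; it is purely an illustrative mass balance.

```python
FEO_MOLAR_MASS = 71.84  # g/mol

def h2_yield_mol_per_kg(feo_initial_wt, feo_final_wt):
    """Moles of H2 per kg of rock if the FeO lost (wt.% difference) is fully
    oxidised via 3 FeO + H2O -> Fe3O4 + H2 (one H2 per three FeO)."""
    delta_g_per_kg = (feo_initial_wt - feo_final_wt) / 100.0 * 1000.0
    return delta_g_per_kg / FEO_MOLAR_MASS / 3.0
```

For the olivine-derived serpentine values above (9.86 to 5.71 wt.% FeO), this idealised balance gives roughly 0.19 mol of H2 per kg.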